A weighted sampling algorithm for the design of RNA sequences with targeted secondary structure and nucleotide distribution
Abstract
Motivations: The design of RNA sequences folding into predefined secondary structures is a milestone for many synthetic biology and gene therapy studies. Most current software uses similar local search strategies (i.e. a random seed is progressively adapted to acquire the desired folding properties) and, more importantly, does not allow the user to explicitly control the nucleotide distribution, such as the GC-content, of the designed sequences. However, the latter is an important criterion for large-scale applications, as it could presumably be used to design sequences with better transcription rates and/or structural plasticity.
Results: In this article, we introduce IncaRNAtion, a novel algorithm to design RNA sequences folding into target secondary structures with a predefined nucleotide distribution. IncaRNAtion uses a global sampling approach and weighted sampling techniques. We show that our approach is fast (i.e. running time comparable to or better than local search methods), seedless (we remove the bias of the seed in local search heuristics) and successfully generates high-quality sequences (i.e. thermodynamically stable) for any GC-content. To complete this study, we develop a hybrid method combining our global sampling approach with local search strategies. Remarkably, our glocal methodology outperforms both local and global approaches for sampling sequences with a specific GC-content and target structure.
Availability: IncaRNAtion is available at csb.cs.mcgill.ca/incarnation/.
Supplementary information: Supplementary data are available at Bioinformatics online.
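To illustrate how a weight on G and C can steer the nucleotide distribution of sampled sequences, here is a minimal sketch. It is not IncaRNAtion's implementation: it ignores all energy terms and only samples sequences whose paired positions form canonical or wobble pairs for a dot-bracket target. The function names, the single weight `w`, and the example structure are assumptions made for this illustration.

```python
import random

# Base pairs allowed at paired positions (canonical plus G-U wobble).
PAIRS = ["AU", "UA", "GC", "CG", "GU", "UG"]
BASES = "ACGU"


def parse_pairs(structure):
    """Map each opening position to its closing partner in dot-bracket notation."""
    stack, partner = [], {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            partner[stack.pop()] = i
    if stack:
        raise ValueError("unbalanced structure")
    return partner


def gc_weight(nucleotides, w):
    """Weight of a choice: w raised to the number of G/C it contains (illustrative)."""
    return w ** sum(c in "GC" for c in nucleotides)


def sample_sequence(structure, w, rng):
    """Sample one structure-compatible sequence; w > 1 favours G/C, w < 1 favours A/U."""
    partner = parse_pairs(structure)
    seq = [None] * len(structure)
    for i, ch in enumerate(structure):
        if ch == ".":
            weights = [gc_weight(b, w) for b in BASES]
            seq[i] = rng.choices(BASES, weights=weights)[0]
        elif ch == "(":
            weights = [gc_weight(p, w) for p in PAIRS]
            pair = rng.choices(PAIRS, weights=weights)[0]
            seq[i], seq[partner[i]] = pair[0], pair[1]
    return "".join(seq)


def gc_content(seq):
    """Fraction of G and C nucleotides in a sequence."""
    return sum(c in "GC" for c in seq) / len(seq)
```

Calling `sample_sequence("((((...))))", 5.0, random.Random(0))` repeatedly and averaging `gc_content` over the samples should give a GC-rich population, while a weight such as `0.2` should give a GC-poor one. In this simplified setting the weight is the only knob, which is why it can be tuned toward a target GC-content.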
Similar resources
A weighted sampling algorithm for the design of RNA sequences with targeted secondary structure and nucleotides distribution
Motivations: The design of RNA sequences folding into predefined secondary structures is a milestone for many synthetic biology and gene therapy studies. Most current software uses similar local search strategies (i.e. a random seed is progressively adapted to acquire the desired folding properties) and, more importantly, does not allow the user to control explicitly the nucleotide distribut...
An Adaptive Defect Weighted Sampling Algorithm to Design Pseudoknotted RNA Secondary Structures
Computational design of RNA sequences that fold into targeted secondary structures has many applications in biomedicine, nanotechnology and synthetic biology. An RNA molecule is made of different types of secondary structure elements, and an important element, the pseudoknot, plays a key role in stabilizing the functional form of the molecule. However, due to the computational complexities a...
Relation Between RNA Sequences, Structures, and Shapes via Variation Networks
Background: RNA plays a key role in many biological processes, and its tertiary structure is critical for its biological function. RNA secondary structure represents significant portions of RNA tertiary structure. Since the biological function of RNA is inferred indirectly from its primary structure, it would be important to analyze the relations between the RNA sequences and t...
PreRkTAG: Prediction of RNA Knotted Structures Using Tree Adjoining Grammars
Background: RNA molecules play many important regulatory, catalytic and structural...
Phylogenetic Analysis of Beta-Glucanase Producing Actinomycetes Strain TBG-CH22 - A Comparison of Conventional and Molecular Morphometric Approach
Actinomycetes are inexhaustible producers of commercially valuable metabolites and are continually screened for beneficial compounds. Taxonomic and phylogenetic studies of novel actinomycete strains are mostly based on conventional methods and the primary DNA structure of 16S rRNA. Although the 16S rRNA sequence is well accepted in phylogeny studies, its secondary structures have not been widely used. ...